Prior Information Based Bayesian Infinite Mixture Model

Authors

  • Zhen Hu
  • Siva Sivaganesan
  • Mario Medvedovic
Abstract

Unsupervised learning methods have been tremendously successful in extracting knowledge from genomics data generated by high-throughput experimental assays. However, analyzing each dataset in isolation, without incorporating potentially informative prior knowledge, limits the utility of such procedures. Here we present a novel probabilistic model and computational algorithm for semi-supervised learning from genomics data. The probabilistic model is an extension of the Bayesian semiparametric Gaussian Infinite Mixture Model (GIMM), and the model parameters are trained using a Markov chain Monte Carlo algorithm. The utility of the procedure in improving the precision of cluster analysis by incorporating prior information is demonstrated in a simulation study and in the analysis of real-world genomics data.
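
The abstract does not spell out how the infinite mixture is constructed. As a rough illustration only, the sketch below draws a partition from a Chinese restaurant process (CRP) prior, a common building block of Bayesian infinite mixture models. It is not the authors' GIMM extension and includes no likelihood, prior information, or MCMC step; the function name `sample_crp_partition` and the concentration parameter `alpha` are assumptions made for this example.

```python
import random


def sample_crp_partition(n_items, alpha, seed=0):
    """Draw one partition of n_items from a CRP prior (illustrative only).

    alpha is a hypothetical concentration parameter (> 0): larger values
    make opening a new cluster more likely.
    """
    rng = random.Random(seed)
    assignments = []    # cluster label assigned to each item, in order
    cluster_sizes = []  # number of items currently in each cluster
    for _ in range(n_items):
        # An existing cluster k is chosen with weight equal to its size;
        # a brand-new cluster is chosen with weight alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)  # open a new cluster
        else:
            cluster_sizes[k] += 1    # join an existing cluster
        assignments.append(k)
    return assignments, cluster_sizes


labels, sizes = sample_crp_partition(n_items=20, alpha=1.0, seed=42)
print(labels)  # cluster label per item
print(sizes)   # items per cluster; the number of clusters is not fixed in advance
```

The number of clusters is an outcome of the draw rather than an input, which is the sense in which such a mixture is called "infinite".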


Related articles

On Bayesian Mixture Credibility

We introduce a class of Bayesian infinite mixture models first introduced by Lo (1984) to determine the credibility premium for a non-homogeneous insurance portfolio. The Bayesian infinite mixture models provide us with much flexibility in the specification of the claim distribution. We employ the sampling scheme based on a weighted Chinese restaurant process introduced in Lo et al. (1996) to e...


Infinite models for speaker clustering

In this paper we propose the use of infinite models for the clustering of speakers. Speaker segmentation is obtained through a Dirichlet Process Mixture (DPM) model, which can be interpreted as a flexible model with an infinite a priori number of components. Learning is based on a Variational Bayesian approximation of the infinite sequence. The DPM model is compared with fixed prior systems learned b...


Supplemental Information Bayesian context-specific infinite mixture model for clustering of gene expression profiles across diverse microarray datasets

OUTLINE: 1. Additional ROC curves for the simulation study 2. Patterns of gene expression based on the joint analysis of cell cycle and sporulation data. 3. Patterns of gene expression based on the analysis of individual datasets (cell cycle and sporulation) separately. 4. Prior and posterior conditional probability distributions in the context-specific infinite mixture model. 5. Dynamic anneal...


Location Reparameterization and Default Priors for Statistical Analysis

This paper develops default priors for Bayesian analysis that reproduce familiar frequentist and Bayesian analyses for models that are exponential or location. For the vector parameter case there is an information adjustment that avoids the Bayesian marginalization paradoxes and properly targets the prior on the parameter of interest thus adjusting for any complicating nonlinearity the details ...


Bayesian non-parametric parsimonious clustering

This paper proposes a new Bayesian non-parametric approach for clustering. It relies on an infinite Gaussian mixture model with a Chinese Restaurant Process (CRP) prior, and an eigenvalue decomposition of the covariance matrix of each cluster. The CRP prior makes it possible to control the model complexity in a principled way and to learn the number of clusters automatically. The covariance matrix decompo...




Publication date: 2010